82 research outputs found
ModuleDigger: an itemset mining framework for the detection of cis-regulatory modules
Background: The detection of cis-regulatory modules (CRMs) that mediate transcriptional responses in eukaryotes remains a key challenge in the postgenomic era. A CRM is characterized by a set of co-occurring transcription factor binding sites (TFBS). In silico methods have been developed to search for CRMs by determining the combination of TFBS that are statistically overrepresented in a certain gene set. Most of these methods solve this combinatorial problem by relying on computationally intensive optimization methods. As a result, their usage is limited to finding CRMs in small datasets (containing a few genes only) and using binding sites for a restricted number of transcription factors (TFs) out of which the optimal module will be selected.
Results: We present an itemset mining-based strategy for computationally detecting cis-regulatory modules (CRMs) in a set of genes. We tested our method by applying it to a large benchmark data set, derived from a ChIP-chip analysis, and compared its performance with that of other well-known cis-regulatory module detection tools.
Conclusion: We show that by exploiting the computational efficiency of an itemset mining approach and combining it with a well-designed statistical scoring scheme, we were able to prioritize the biologically valid CRMs in a large set of coregulated genes using binding sites for a large number of potential TFs as input
DNA replication determines timing of mitosis by restricting CDK1 and PLK1 activation
To maintain genome stability, cells need to replicate their DNA before dividing. The kinases CDK1 and PLK1 drive mitotic entry and become active when bulk DNA synthesis is completed at the S/G2 transition. Here, we have tested the hypothesis that DNA replication controls activation of mitotic kinases. Using an optimized double-degron system, we find that human cells unable to initiate DNA replication in S-phase promptly activate CDK1 and PLK1 and prematurely enter mitosis. In the presence of DNA replication, inhibition of CHK1 and p38 leads to premature activation of CDK1 and PLK1. While CDK2 activity promotes DNA replication, activation of CDK1 in S-phase induces severe replication stress. We propose that mitotic kinase activation is governed by a CDK2- and DNA replication-dependent feed-forward loop that ensures timely cell division while preserving genome stability. DNA replication thus functions as a brake that coordinates cell cycle activities and determines cell cycle duration
In silico identification and experimental validation of PmrAB targets in Salmonella typhimurium by regulatory motif detection
BACKGROUND: The PmrAB (BasSR) two-component regulatory system is required for Salmonella typhimurium virulence. PmrAB-controlled modifications of the lipopolysaccharide (LPS) layer confer resistance to cationic antibiotic polypeptides, which may allow bacteria to survive within macrophages. The PmrAB system also confers resistance to Fe(3+)-mediated killing. New targets of the system have recently been discovered that seem not to have a role in the well-described functions of PmrAB, suggesting that the PmrAB-dependent regulon might contain additional, unidentified targets. RESULTS: We performed an in silico analysis of possible targets of the PmrAB system. Using a motif model of the PmrA binding site in DNA, genome-wide screening was carried out to detect PmrAB target genes. To increase confidence in the predictions, all putative targets were subjected to a cross-species comparison (phylogenetic footprinting) using a Gibbs sampling-based motif-detection procedure. As well as the known targets, we detected additional targets with unknown functions. Four of these were experimentally validated (yibD, aroQ, mig-13 and sseJ). Site-directed mutagenesis of the PmrA-binding site (PmrA box) in yibD revealed specific sequence requirements. CONCLUSIONS: We demonstrated the efficiency of our procedure by recovering most of the known PmrAB-dependent targets and by identifying unknown targets that we were able to validate experimentally. We also pinpointed directions for further research that could help elucidate the S. typhimurium virulence pathway
Inferring transcriptional modules from ChIP-chip, motif and microarray data
'ReMoDiscovery' is an intuitive algorithm to correlate regulatory programs with regulators and corresponding motifs to a set of co-expressed genes. It exploits in a concurrent way three independent data sources: ChIP-chip data, motif information and gene expression profiles. When compared to published module discovery algorithms, ReMoDiscovery is fast and easily tunable. We evaluated our method on yeast data, where it was shown to generate biologically meaningful findings and allowed the prediction of potential novel roles of transcriptional regulators
DISTILLER: a data integration framework to reveal condition dependency of complex regulons in Escherichia coli
DISTILLER, a data integration framework for the inference of transcriptional module networks, is presented and used to investigate the condition dependency and modularity in Escherichia coli networks
Broad clinical phenotypes associated with TAR-DNA binding protein (TARDBP) mutations in amyotrophic lateral sclerosis
The finding of TDP-43 as a major component of ubiquitinated protein inclusions in amyotrophic lateral sclerosis (ALS) has led to the identification of 30 mutations in the transactive response-DNA binding protein (TARDBP) gene, encoding TDP-43. All but one are in exon 6, which encodes the glycine-rich domain. The aim of this study was to determine the frequency of TARDBP mutations in a large cohort of motor neurone disease patients from Northern England (42 non-superoxide dismutase 1 (SOD1) familial ALS (FALS), nine ALS-frontotemporal dementia, 474 sporadic ALS (SALS), 45 progressive muscular atrophy cases). We identified four mutations, two of which were novel, in two familial (FALS) and two sporadic (SALS) cases, giving a frequency of TARDBP mutations in non-SOD1 FALS of 5% and SALS of 0.4%. Analysis of clinical data identified that patients had typical ALS, with limb or bulbar onset, and showed considerable variation in age of onset and rapidity of disease course. However, all cases had an absence of clinically overt cognitive dysfunction
Genome-wide association study identifies a variant in HDAC9 associated with large vessel ischemic stroke
Genetic factors have been implicated in stroke risk but few replicated associations have been reported. We conducted a genome-wide association study (GWAS) in ischemic stroke and its subtypes in 3,548 cases and 5,972 controls, all of European ancestry. Replication of potential signals was performed in 5,859 cases and 6,281 controls. We replicated reported associations between variants close to PITX2 and ZFHX3 with cardioembolic stroke, and a 9p21 locus with large vessel stroke. We identified a novel association for a SNP within the histone deacetylase 9 (HDAC9) gene on chromosome 7p21.1 which was associated with large vessel stroke including additional replication in a further 735 cases and 28,583 controls (rs11984041, combined P = 1.87×10⁻¹¹, OR = 1.42, 95% CI 1.28-1.57). All four loci exhibit evidence for heterogeneity of effect across the stroke subtypes, with some, and possibly all, affecting risk for only one subtype. This suggests differing genetic architectures for different stroke subtypes
COLOMBOS: Access Port for Cross-Platform Bacterial Expression Compendia
Background: Microarrays are the main technology for large-scale transcriptional gene expression profiling, but the large bodies of data available in public databases are not useful due to the large heterogeneity. There are several initiatives that attempt to bundle these data into expression compendia, but such resources for bacterial organisms are scarce and limited to integration of experiments from the same platform or to indirect integration of per experiment analysis results.
Methodology/Principal Findings: We have constructed comprehensive organism-specific cross-platform expression compendia for three bacterial model organisms (Escherichia coli, Bacillus subtilis, and Salmonella enterica serovar Typhimurium) together with an access portal, dubbed COLOMBOS, that not only provides easy access to the compendia, but also includes a suite of tools for exploring, analyzing, and visualizing the data within these compendia. It is freely available at http://bioi.biw.kuleuven.be/colombos. The compendia are unique in directly combining expression information from different microarray platforms and experiments, and we illustrate the potential benefits of this direct integration with a case study: extending the known regulon of the Fur transcription factor of E. coli. The compendia also incorporate extensive annotations for both genes and experimental conditions; these heterogeneous data are functionally integrated in the COLOMBOS analysis tools to interactively browse and query the compendia not only for specific genes or experiments, but also metabolic pathways, transcriptional regulation mechanisms, experimental conditions, biological processes, etc.
Conclusions/Significance: We have created cross-platform expression compendia for several bacterial organisms and developed a complementary access port COLOMBOS, that also serves as a convenient expression analysis tool to extract useful biological information. This work is relevant to a large community of microbiologists by facilitating the use of publicly available microarray experiments to support their research